Abstract

A new mathematical technique for the adaptation of the results of numerical wave 
prediction models to local conditions is proposed in this work. The main aim is to reduce the 
systematic part of the prediction error in the direct model outputs by taking advantage of the 
availability of local measurements in the area of interest. The methodology is based on a 
combination of two different statistical tools: Kolmogorov-Zurbenko (KZ) and Kalman filters. The
first smooths the observation time series, as well as that of the direct model outputs, so that the
two become comparable via a Kalman filter. In general they are not, since forecasted values are
smoothed spatially and temporally by the model itself while observations are point records to which
no smoothing procedure is applied. The direct application of a Kalman filter to such qualitatively
different series may lead to serious instabilities of the method and discontinuities in the results.
The proper use of KZ filters makes the two series compatible and, therefore,
makes possible the exploitation of Kalman filters for the identification and subtraction of 
systematic errors. The proposed method was tested in an open sea area for significant wave 
height forecasts using the wave model WAM and six buoys as observational stations. 
1. Introduction


The need for accurate local wave predictions has increased considerably during the last few
years because of the activities that depend on them, such as ship traffic, tourism and offshore
exploration. The most reliable tools available today for such forecasts are numerical wave
prediction models. A large number of operational and research centers worldwide base their
predictions on global or regional wave models, with rather successful results concerning the
general sea state forecast. However, if one focuses on specific locations and tries to obtain
accurate local wave information, serious and systematic divergences are usually revealed. These
divergences are mainly due to the fact that wave model outputs depend strongly on local
characteristics, initial conditions and the atmospheric data used as input. In addition, numerical
models cannot successfully simulate sub-grid scale phenomena. Similar drawbacks have also been
pointed out in the numerical forecasting of atmospheric parameters.

In order to reduce the impact of the abovementioned problems on the final outputs of the
forecasting systems, a variety of approaches have been employed. One possible way out is to
increase the model resolution. This may lead to some improvement in the representation of
smaller scale wave characteristics. However, such a change would also demand a corresponding
increase in the resolution of the atmospheric model that is used to provide the necessary wind
input. It would be meaningless otherwise, since all wave models used today are wind driven, with
the wind input being the most crucial component. Moreover, it remains an open question
whether the use of higher resolution models improves forecast skill or whether the potential
improvement compensates for the increase in computational resources required (see e.g. [16]).

An alternative option for improving local forecasts in numerical (wave or
atmospheric) forecasting is provided by statistical methods that aim at the local adaptation of
the direct model outputs. Many of them are derived from Model Output Statistics (MOS), which
are able to account for local effects and seasonal changes. However, discrepancies have been
found in such applications in cases of short-term local weather changes or updates of the
numerical model in use (see e.g. [10,15]).

An alternative approach, with excellent results in many previous studies for different
forecasted parameters, is the use of Kalman filtering [1, 3, 6, 7, 11, 12, 13, 17]. The Kalman filter
consists of a set of mathematical equations that provides an efficient computational solution of
the least squares method. Observations are recursively combined with recent forecasts using
weights that minimize the corresponding biases. The main advantages of Kalman filters are their easy
adaptation to any alteration of the observations and the fact that only short series of
background information are needed. However, even with this more dynamic
methodology a number of problems remain unsolved, leading to serious divergences. The main
reason is that the two time series used as input to Kalman filters, the model forecasts and the
corresponding observed values, have different qualitative characteristics. Model outputs are
always smoothed in time and space and therefore evolve continuously and mildly. On the
other hand, observations are point measurements recorded at discrete times without smoothing
and are therefore discontinuous and highly variable (see fig. 3). As a result, the direct use of
such time series by a Kalman filter may lead to serious instabilities. Such a case is discussed in
Section 4 and visualized in Figure 4.

In this work a new methodology is proposed that responds quite successfully to the above
mentioned difficulties and leads to a considerable improvement of the local wave forecast. It
consists of a combination of Kolmogorov-Zurbenko (KZ) filters [5, 18] with Kalman algorithms. The
former are applied to the initial recorded observations and to the model's direct outputs, smoothing any
possible high variability and reducing noisy intervals. The percentage of the removed variability
can be controlled by an appropriate choice of filter parameters. In this way, the above mentioned
differences in the qualitative characteristics of the time series under study are eased. At the same
time, possible systematic deviations are clearly revealed. In a second step, the KZ-filtered results
are input to a Kalman filter, which can be applied smoothly with no instabilities, leading to a very
satisfactory adaptation of the forecasts to the characteristics of the local area.


2. Model Description 

The wave model used in this paper is WAM cycle 4 ([2, 8, 10, 14, 19]) developed at the
European Centre for Medium-Range Weather Forecasts (ECMWF). WAM is a third-generation
wave model which solves the wave transport equation explicitly without any assumptions on the
shape of the wave spectrum. It represents the physics of wave evolution, in accordance with current
knowledge, for the full set of degrees of freedom of a two-dimensional wave spectrum.

The first statistical tool employed is a Kolmogorov-Zurbenko (KZ) filter. A detailed
presentation of the philosophy and use of this type of filter can be found in [5, 18].
KZ filters are based on iterated moving averages and are able to remove high frequency variations
from the initial data. To be more precise, if we denote by $(x^0_i)_i$ the initial values of a series, the
first iteration of the filter smooths them as follows:

$$ x'_i = \frac{1}{2q+1} \sum_{j=-q}^{q} x^0_{i+j} \tag{1} $$

Here, the parameter $q$ determines the length of the filter window, which is $m = 2q+1$. In the next step,
these values $(x'_i)_i$ become the input for the second iteration,
$x''_i = \frac{1}{2q+1} \sum_{j=-q}^{q} x'_{i+j}$, and so on. The
parameter $m$ and the number of iterations $n$ control the portion of the variability that one wants
to exclude. In particular, they determine the desired separating frequency (see [5, 18]).
It is worth noting that, in the present work, the KZ-filter is not utilized within the forecasting
period. It is only applied to past observations and model results in order to ease possible
qualitative differences and transform them to a comparable mode.
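The iteration of equation (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `moving_average` and `kz_filter` are hypothetical, the treatment of the series ends (the window is clipped to the available samples) is an assumption that the text does not specify, and the data are synthetic.

```python
import numpy as np

def moving_average(x, q):
    """One pass of equation (1): centred mean over the window i-q .. i+q.

    Assumption (not stated in the paper): near the ends of the series the
    window is clipped to the samples that actually exist.
    """
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        lo = max(0, i - q)
        hi = min(len(x), i + q + 1)
        out[i] = x[lo:hi].mean()
    return out

def kz_filter(x, m, n):
    """KZ(m, n): n iterations of a moving average with window m = 2q + 1."""
    if m < 1 or m % 2 == 0:
        raise ValueError("window length m must be a positive odd integer")
    q = (m - 1) // 2
    y = np.asarray(x, dtype=float)
    for _ in range(n):
        y = moving_average(y, q)
    return y

# Synthetic example (illustrative data only): a slow oscillation plus noise.
rng = np.random.default_rng(0)
t = np.arange(200)
signal = 2.0 + 0.5 * np.sin(2 * np.pi * t / 100)
noisy = signal + rng.normal(0.0, 0.3, size=t.size)
smooth = kz_filter(noisy, m=5, n=5)
```

The output has the same length as the input, so a KZ-smoothed observation series and a KZ-smoothed forecast series remain aligned in time.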

The KZ-smoothed time series form the initial data for Kalman filtering. In order to make the 
paper as self-contained as possible, a detailed description of the general form of the Kalman filter 
algorithm, based on the unified notation proposed by [9], follows. 

Kalman filters simulate the evolution in time of an unknown process (state vector), whose
"true" value at time $t_i$ is denoted here by $x^t(t_i)$. This is combined with a corresponding known
array (observations) $y^o_i$ which refers to the same time. The change of $x$ in time is governed by the
system equation:

$$ x^t(t_i) = M_{i-1}[x^t(t_{i-1})] + \eta(t_{i-1}) \tag{2} $$

The observation equation describes the relation between the observation vector and the unknown
one:

$$ y^o_i = H_i[x^t(t_i)] + \varepsilon_i \tag{3} $$

The matrices $M_i$ (system operator) and $H_i$ (observation operator), as well as the covariance matrices
$Q(t_i)$, $R(t_i)$ of the Gaussian (by assumption) and independent random vectors $\eta(t_i)$, $\varepsilon_i$,
respectively, have to be determined before the application of the filter.

The first forecast step of the state vector $x$ and its error covariance matrix $P$ is given by:

$$ x^f(t_i) = M_{i-1}[x^a(t_{i-1})] \tag{4a} $$

$$ P^f(t_i) = M_{i-1} P^a(t_{i-1}) M_{i-1}^T + Q(t_{i-1}) \tag{4b} $$

This is followed by an update (analysis) step in which the observation available at time $t_i$ is
combined with the previous information:

$$ x^a(t_i) = x^f(t_i) + K_i \left( y^o_i - H_i[x^f(t_i)] \right) \tag{5a} $$

$$ P^a(t_i) = (I - K_i H_i) P^f(t_i) \tag{5b} $$

Here

$$ K_i = P^f(t_i) H_i^T \left[ H_i P^f(t_i) H_i^T + R_i \right]^{-1} \tag{6} $$

is the Kalman gain, which controls how easily the filter adjusts to possible new conditions. The
superscripts o, t, f, a denote observation, true, forecast and analysis values, respectively.
Moreover, T and -1 stand for the transpose and the inverse of a matrix, respectively, while $I$ is the
identity matrix. Equations (2)-(6) update the Kalman algorithm from time $t_{i-1}$ to $t_i$.
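One cycle of the algorithm, from the forecast step through the gain to the analysis step, can be sketched as follows. The sketch assumes that the system and observation operators are linear and given as matrices; the function name `kalman_step` and the scalar numbers in the example are hypothetical.

```python
import numpy as np

def kalman_step(x_a, P_a, y_obs, M, H, Q, R):
    """Advance the filter by one time step (linear operators assumed).

    x_a, P_a : previous analysis state (n,) and its covariance (n, n)
    y_obs    : observation available at the new time (p,)
    M, H     : system (n, n) and observation (p, n) matrices
    Q, R     : system (n, n) and observation (p, p) noise covariances
    """
    # Forecast step for the state and its error covariance.
    x_f = M @ x_a
    P_f = M @ P_a @ M.T + Q
    # Kalman gain.
    S = H @ P_f @ H.T + R
    K = P_f @ H.T @ np.linalg.inv(S)
    # Analysis (update) step.
    x_new = x_f + K @ (y_obs - H @ x_f)
    P_new = (np.eye(len(x_a)) - K @ H) @ P_f
    return x_new, P_new, K

# Scalar illustration with made-up numbers: equal forecast and observation
# uncertainty gives a gain of 0.5, so the analysis lands halfway.
x1, P1, K1 = kalman_step(
    x_a=np.array([0.0]), P_a=np.array([[1.0]]), y_obs=np.array([2.0]),
    M=np.eye(1), H=np.eye(1), Q=np.zeros((1, 1)), R=np.eye(1),
)
```

A larger R relative to the forecast covariance lowers the gain, so the filter follows new observations more slowly; a smaller R has the opposite effect.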

3. The case studied 

For the present work, a global version of the WAM model was utilized. The wave spectrum
was discretized in 30 frequencies and 24 directions. The first integration frequency was
set to 0.0417 Hz and the propagation time step to 300 seconds. The model ran in deep
water mode with no refraction. WAM was driven by NCEP/GFS model wind data with a horizontal
grid resolution of 1.0x1.0 degree.

The area of study was the southwest coast of the United States, as presented in fig. 1. The
same map indicates the locations of the buoys used as observational sources. All of them
belong to the NOAA/National Data Buoy Center network and their exact positions in Lat-Lon
coordinates are given in Table 1. It should be noted that, since these locations do not coincide
with WAM grid points, the corresponding forecasts have been interpolated to them.
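The interpolation scheme is not specified in the text. As one possible illustration, the sketch below uses bilinear interpolation on a regular latitude-longitude grid; the function name `bilinear`, the 1-degree grid spacing and the field values are all hypothetical, and only the buoy position (buoy A of Table 1, with west longitude written as negative) comes from the paper.

```python
import math

def bilinear(field, lat0, lon0, dlat, dlon, lat, lon):
    """Bilinear interpolation on a regular grid (assumed scheme).

    field[i][j] holds the value at latitude lat0 + i*dlat and
    longitude lon0 + j*dlon. The target point must lie strictly inside
    the grid so that the four surrounding nodes exist.
    """
    fi = (lat - lat0) / dlat
    fj = (lon - lon0) / dlon
    i, j = int(math.floor(fi)), int(math.floor(fj))
    ti, tj = fi - i, fj - j  # fractional position inside the grid cell
    return ((1 - ti) * (1 - tj) * field[i][j]
            + (1 - ti) * tj * field[i][j + 1]
            + ti * (1 - tj) * field[i + 1][j]
            + ti * tj * field[i + 1][j + 1])

# Hypothetical significant wave heights (m) at the four nodes around buoy A.
swh_field = [[2.0, 2.4],
             [2.2, 2.6]]
swh_at_buoy_a = bilinear(swh_field, 34.0, -121.0, 1.0, 1.0, 34.88, -120.87)
```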

***** Desired Location of Figure 1 ******** 

***** Desired Location of Table 1 ******** 
A KZ(m=5,n=5) filter was employed, which corresponds to a cutoff frequency of 0.0411 cycles per
time step, i.e. a period of 24.3 time steps (see the corresponding criterion presented in Section 2 as
well as the relevant references [5,18]).
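The quoted figures can be reproduced numerically. The sketch below assumes an approximation that is commonly quoted for KZ filters, $\omega_0 \approx \frac{\sqrt{6}}{\pi}\sqrt{\frac{1-(1/2)^{1/(2n)}}{m^2-(1/2)^{1/(2n)}}}$. This expression is an assumption introduced here for illustration, not a formula taken from the paper, and the function name `kz_cutoff_frequency` is hypothetical.

```python
import math

def kz_cutoff_frequency(m, n):
    """Approximate cutoff of a KZ(m, n) filter in cycles per time step.

    Assumed formula (commonly quoted approximation for KZ filters);
    it is used here only to check the numbers given in the text.
    """
    a = 0.5 ** (1.0 / (2 * n))
    return (math.sqrt(6.0) / math.pi) * math.sqrt((1.0 - a) / (m * m - a))

omega0 = kz_cutoff_frequency(5, 5)
period = 1.0 / omega0
print(f"cutoff = {omega0:.4f} cycles per step, period = {period:.1f} steps")
```

Under this assumption, KZ(5,5) gives a cutoff of about 0.0411 and a period of about 24.3 time steps, in agreement with the values stated above.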

Concerning the Kalman filter used, a brief description follows. A single forecasted
parameter was utilized: the significant wave height (swh). The corresponding bias is
estimated as a polynomial of the direct output of the forecasting model. This choice has already been
used in previous applications of Kalman filtering to other meteorological parameters
(temperature [6], wind speed [7]), resulting in a considerable reduction of the systematic error.
To be more precise, let $swh_i$ denote the direct output of the model at time $t_i$. Then, the
corresponding bias $y^o_i$ is estimated by means of $swh_i$ in a linear form:

$$ y^o_i = a_{0,i} + a_{1,i} \cdot swh_i + \varepsilon_i \tag{7} $$

The coefficients $(a_{0,i}, a_{1,i})$ are the parameters that have to be estimated by the filter, while
$\varepsilon_i$ is the Gaussian, non-systematic error of the procedure.

In this way, the state vector of the filter becomes $x(t_i) = [a_{0,i} \; a_{1,i}]^T$, the observation is
the (scalar) bias $y^o_i$, the observation matrix takes the form $H_i = [1 \; swh_i]$ and the
identity $I_2$ is used as system matrix. Therefore, the system and observation equations take the following form:

$$ x^t(t_{i+1}) = x^t(t_i) + \eta(t_i), \qquad y^o_i = H_i[x^t(t_i)] + \varepsilon_i \tag{8} $$

The initial value of the state vector $x$ is zero, assuming, in this way, that the initial bias of
the forecasting model is non-systematic: $y^o_0 = \varepsilon_0$ (equations 7, 8). The covariance matrix $P$
(equation 4b) is considered initially diagonal, indicating no correlations between different
coordinates of the state vector $x$. The diagonal elements initially have a relatively large value
(here we propose …), which declares low credibility of the first guess. The initial values
of the variances $Q(t_i)$, $R(t_i)$ (equations 4b, 6) are $Q(t_0) = I_2$, $R(t_0) = 6$ (a sufficiently large estimate
leading to quick independence from initial conditions). The selection of the above values also leads
to an initial Kalman gain that contributes to the fast adaptability of the filter to possible new
conditions (equation 6).

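The subsequent values of $Q(t_i)$ and $R(t_i)$ are based on the sample of the last 7 values of
$\eta(t_i) = x(t_{i+1}) - x(t_i)$ and $\varepsilon_i = y^o_i - H_i[x(t_i)]$, respectively:

$$ Q(t_i) = \frac{1}{6} \sum_{j=0}^{6} \left( \eta(t_{i-j}) - \frac{1}{7} \sum_{k=0}^{6} \eta(t_{i-k}) \right)^2 \tag{9} $$

$$ R(t_i) = \frac{1}{6} \sum_{j=0}^{6} \left( \varepsilon_{i-j} - \frac{1}{7} \sum_{k=0}^{6} \varepsilon_{i-k} \right)^2 \tag{10} $$

These are unbiased estimators of $Q(t_i)$, $R(t_i)$ respectively, due to the fact that the variables $\eta(t_i)$
and $\varepsilon_i$ denote the non-systematic part of the errors in equations (8) and follow the normal
distribution.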
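The filter described in this section can be sketched as follows, under several assumptions that are not taken from the paper: the initial diagonal value of P (`p0 = 4.0`) is a hypothetical choice, Q is kept diagonal with one sample variance per coefficient, a small floor `r_floor` keeps R strictly positive, and the residuals are evaluated with the updated state. The function name `run_bias_filter` and the synthetic data are illustrative only.

```python
from collections import deque
import numpy as np

def run_bias_filter(swh, bias, p0=4.0, window=7, r_floor=1e-6):
    """Recursively estimate a0, a1 in bias ~ a0 + a1 * swh.

    swh  : model direct outputs (ideally KZ-smoothed)
    bias : observed bias at the same times
    p0, r_floor : hypothetical tuning values, not from the paper
    """
    x = np.zeros(2)                # initial state is zero
    P = p0 * np.eye(2)             # diagonal, relatively large
    Q = np.eye(2)                  # Q(t_0) = I_2
    R = 6.0                        # R(t_0) = 6
    etas = deque(maxlen=window)    # last 7 state increments
    epss = deque(maxlen=window)    # last 7 observation residuals
    history = []
    for s, y in zip(swh, bias):
        H = np.array([1.0, s])
        # Forecast step with the identity as system matrix.
        x_f, P_f = x, P + Q
        # Gain and update for a scalar observation.
        K = P_f @ H / (H @ P_f @ H + R)
        x_new = x_f + K * (y - H @ x_f)
        P = (np.eye(2) - np.outer(K, H)) @ P_f
        etas.append(x_new - x)
        epss.append(y - H @ x_new)
        x = x_new
        if len(etas) == window:
            # Sample variances over the last 7 values (divisor 6).
            Q = np.diag(np.var(np.array(list(etas)), axis=0, ddof=1))
            R = max(float(np.var(np.array(list(epss)), ddof=1)), r_floor)
        history.append(x.copy())
    return np.array(history)

# Synthetic, noise-free check: the bias is an exact linear function of swh.
rng = np.random.default_rng(1)
swh = rng.uniform(1.0, 3.0, size=200)
bias = 0.2 + 0.3 * swh
coeffs = run_bias_filter(swh, bias)
fitted = coeffs[:, 0] + coeffs[:, 1] * swh
```

Until seven increments and residuals have been collected, the sketch keeps the initial values of Q and R; how the paper handles these first steps is not stated.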

The period of 7 time steps was chosen after a sensitivity analysis that was carried out
for different meteorological parameters and led to the conclusion that this short time interval is
adequate to obtain significantly improved forecasts with the application of the filter (for the
relevant tests see [7]). Moreover, this choice allows fast adaptation to possible changes in the
data and does not create a need for extended data storage.

It is also worth noting that this study was based on an operational run. More specifically, all
the models were used iteratively and the observations of each day d were combined with model
forecasts for the same day d and the next one d+1 so as to achieve a Kalman-filter forecast for day
d+1 (figure 2a). These new forecasts were evaluated against the observations of the new day d+1
when these became available (fig 2b). In this way, it was ensured that the evaluation data were not
mixed with those used for forecasting.
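The separation between training and evaluation data can be illustrated with a short walk-forward loop. In this sketch the estimator `fit_linear_bias` is an ordinary least squares line that merely stands in for the KZ + Kalman estimate; it is not the filter of this paper. The sign convention (bias = forecast minus observation) and the toy data are assumptions, and the function names are hypothetical.

```python
import numpy as np

def fit_linear_bias(swh, bias):
    """Placeholder estimator: least squares line bias ~ a0 + a1 * swh.

    It stands in for the KZ + Kalman estimate and is NOT the paper's filter.
    """
    a1, a0 = np.polyfit(swh, bias, 1)  # polyfit returns the slope first
    return a0, a1

def walk_forward(model_by_day, obs_by_day, fit=fit_linear_bias):
    """For each day d, fit on day d only and correct the forecast for d+1."""
    corrected = {}
    for d in range(len(model_by_day) - 1):
        model_d = np.asarray(model_by_day[d], dtype=float)
        obs_d = np.asarray(obs_by_day[d], dtype=float)
        # Assumed sign convention: bias = forecast - observation.
        a0, a1 = fit(model_d, model_d - obs_d)
        model_next = np.asarray(model_by_day[d + 1], dtype=float)
        # Observations of day d+1 are never used to build this forecast.
        corrected[d + 1] = model_next - (a0 + a1 * model_next)
    return corrected

# Toy data: the model overestimates by a constant 0.5 m on every day.
obs = [np.array([1.0, 1.5, 2.0, 1.2]),
       np.array([1.1, 1.8, 2.4, 1.6]),
       np.array([0.9, 1.3, 1.7, 2.1])]
model = [o + 0.5 for o in obs]
corrected = walk_forward(model, obs)
```

The corrected forecast for day d+1 can then be scored against the observations of day d+1 once they arrive, which mirrors the evaluation step described above.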

***** Desired Location of Figure 2 ******** 

The statistical analysis was based on the following parameters: 
■ Bias of forecasted (filtered or not) values:

$$ \mathrm{Bias} = \frac{1}{k} \sum_{i=1}^{k} \left( for(i) - obs(i) \right) \tag{11} $$

Here obs(i) denotes the recorded (observed) value at time i, for(i) the respective forecast
(direct model output or improved forecast via the proposed filter) and k the size of the
sample. Bias is the most crucial parameter for any type of filtering procedure since they all
aim at eliminating the systematic error.

■ Mean Absolute Percentage Error:

$$ \mathrm{MAPE} = \frac{1}{k} \sum_{i=1}^{k} \left| \frac{for(i) - obs(i)}{obs(i)} \right| \tag{12} $$

where | | stands for the absolute value. This parameter measures the divergence of the
forecasts as a proportion of the observations.

■ Root Mean Square Error:

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{k} \sum_{i=1}^{k} \left( for(i) - obs(i) \right)^2 } \tag{13} $$

a classical and widely used divergence measure.
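The three statistics can be computed as in the following sketch. The function names and the small sample are illustrative only; MAPE is returned as a proportion, not multiplied by 100, and assumes strictly positive observations.

```python
import numpy as np

def bias(forecast, obs):
    """Mean of forecast minus observation."""
    f, o = np.asarray(forecast, float), np.asarray(obs, float)
    return float(np.mean(f - o))

def mape(forecast, obs):
    """Mean absolute error relative to the observations (obs must be > 0)."""
    f, o = np.asarray(forecast, float), np.asarray(obs, float)
    return float(np.mean(np.abs(f - o) / o))

def rmse(forecast, obs):
    """Root of the mean squared difference."""
    f, o = np.asarray(forecast, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((f - o) ** 2)))

# Made-up sample: wave heights in metres.
forecast = [1.0, 2.0, 3.0]
observed = [1.0, 1.0, 2.0]
stats = (bias(forecast, observed), mape(forecast, observed), rmse(forecast, observed))
```

For this sample the bias is 2/3, the MAPE is 0.5 and the RMSE is the square root of 2/3. A positive bias indicates that the forecasts overestimate the observations.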


4. Results 

As already discussed in the previous sections, one of the main problems in numerical wave
forecasting is the difficulty of providing accurate local predictions, which are crucial for several
applications. A popular tool for addressing this issue is the Kalman filter, which provides
fast and accurate adaptation to local conditions by recursively combining direct model outputs
with recent corresponding observations (see Sections 2, 3). However, an aspect that should be
seriously taken into account when using Kalman filter post-processing is the requirement
that the time series employed have the same qualitative characteristics. Kalman filters may
detect and subtract possible systematic error that emerges between model and observation time
series no matter its magnitude or type (underestimation or overestimation). However, both series
used have to follow a "similar", qualitatively speaking, evolution in time. Filtering a smooth and
continuous time series by using a corresponding noisy one with increased variability and
discontinuities is risky and may lead to serious instabilities in the corresponding results. A relevant
example is presented in Figure 3:

***** Desired Location of Figure 3 ******** 

The time series of significant wave height values as forecasted by WAM and the corresponding
records of buoys A and E are plotted. It is obvious in both cases that, although the model follows the
general pattern of the observations, the two time series have totally different qualitative
characteristics. The prediction model outputs are much smoother and more continuous than the
observations, which is expected since WAM forecasts, like the results of any numerical prediction
model, are smoothed in time and space. On the other hand, the buoy time series are point
records with no smoothing procedure applied. Therefore, although a systematic error is almost
obvious, if one tries to pass these time series through a Kalman filter, the latter, being unable to
compare them in an appropriate way, produces filtered values with serious instabilities, calling
its validity into question. In figures 4a and 4b a relevant example is presented.

***** Desired Location of Figures 4a and 4b ******** 

Two time series of different qualitative characteristics (fig. 4a) are filtered by a linear Kalman 
algorithm. Several moderate or major instabilities emerge (fig. 4b). 

In order to further clarify this argument, a variability index is presented in Table 2 for both
model forecasts and observations. This index measures the "distance" of the initial values
$(x(i))_{i=1,2,\dots,n}$ from their corresponding KZ-filter counterparts $(y(i))_{i=1,2,\dots,n}$ and is calculated as follows:

Var(x) = …    (14)

It is an index rather similar to the well-known Root Mean Square Error, measuring the divergence of
the time series under study from the corresponding KZ-filtered values instead of their mean.

***** Desired Location of Table 2 ********

In all cases the variability of the observations exceeds that of the corresponding
model predictions, on average by a factor of roughly 1.7 (0.24 against 0.14 in Table 2). It is
important to underline that this is not a strange or extreme situation.
However, it is a serious problem if one wants to filter these time series through a Kalman process
in order to extract possible systematic error.
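The ratios implied by Table 2 can be checked with a few lines of Python. The numbers are those of the table; the variable names are illustrative.

```python
# Variability index per buoy from Table 2: (model, observations).
table2 = {
    "A": (0.07, 0.22),
    "B": (0.17, 0.29),
    "C": (0.13, 0.18),
    "D": (0.16, 0.24),
    "E": (0.17, 0.31),
    "F": (0.14, 0.18),
}

# Ratio of observed to modelled variability at each buoy.
ratios = {buoy: obs / model for buoy, (model, obs) in table2.items()}

# Averages over the six buoys, to compare with the "Average" column.
mean_model = sum(model for model, _ in table2.values()) / len(table2)
mean_obs = sum(obs for _, obs in table2.values()) / len(table2)

for buoy, ratio in ratios.items():
    print(f"buoy {buoy}: obs/model = {ratio:.2f}")
print(f"average: model = {mean_model:.2f}, obs = {mean_obs:.2f}, "
      f"ratio = {mean_obs / mean_model:.2f}")
```

The observed variability is larger than the modelled one at every buoy, with the largest contrast at buoy A.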

A way out of the above difficulties is offered by the Kolmogorov-Zurbenko (KZ)
filters presented in Sections 2 and 3. If the two time series of interest, WAM predictions and the
corresponding observation records, pass through such a filter, then the high frequencies and the
undesired variability are removed. In Figure 5, the time series of Figure 3 are presented after the
application of a KZ-filter.

***** Desired Location of Figure 5 ******** 

It is obvious that the resulting time series have much more similar qualitative characteristics while 
the existing systematic divergence between forecasts and observations becomes more evident. 
Therefore, these time series are more appropriate to be utilized by a Kalman filter. 

It is worth noting here that, given the smoothing already applied by numerical prediction
models, one could avoid KZ filtering the forecasts, restricting it to the observations only. However, it
is the author's belief that, by filtering both time series, their compatibility is further ensured. Note
also that for both filters the corresponding parameters are those defined in Section 3 where the
details of our tests are presented. The corresponding training period has been restricted to a
seven day interval, exploiting the ability of Kalman filters to adapt easily to possible new conditions
as well as their limited need for background information.
The above described procedure was applied to the six available buoys. The filters
performed well in all cases, eliminating the major part of the systematic error, regardless of its type,
and leading to more accurate local forecasts. Table 3 presents some statistical results for the area of
interest before and after the application of the filters.

***** Desired Location of Table 3 ********

In all cases, the bias has almost vanished, fulfilling the main goal of any Kalman type filter.
In addition, the RMSE is decreased (on average from 0.86 to 0.60) and the MAPE, which gives the
discrepancies of the forecasts as a proportion of the observations, is reduced on average to less than
half of its initial value (from 0.54 to 0.25). It is worth noting that such improvements have not been
achieved by our group when using only Kalman filters ([4, 6, 7]). This progress is due to the
combined use of Kalman and KZ filters, the latter ensuring the best adaptability of the time series in
use to the Kalman algorithm. The statistical results are graphically represented in Figures 6-8.
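The averages and relative reductions implied by Table 3 can be recomputed as a consistency check. The values below are copied from the table (direct model output first, model plus filters second); the names are illustrative.

```python
# Table 3 per buoy: metric -> (model, model + filters).
table3 = {
    "A": {"bias": (0.51, -0.08), "rmse": (0.74, 0.63), "mape": (0.42, 0.26)},
    "B": {"bias": (0.13, -0.03), "rmse": (0.74, 0.72), "mape": (0.31, 0.25)},
    "C": {"bias": (0.99, 0.03), "rmse": (1.12, 0.44), "mape": (1.05, 0.30)},
    "D": {"bias": (0.56, 0.002), "rmse": (0.84, 0.61), "mape": (0.43, 0.21)},
    "E": {"bias": (0.36, -0.02), "rmse": (0.82, 0.69), "mape": (0.38, 0.23)},
    "F": {"bias": (0.70, 0.01), "rmse": (0.89, 0.49), "mape": (0.63, 0.23)},
}

def mean_over_buoys(metric, column):
    """Average of one metric over the six buoys (column 0: model, 1: filtered)."""
    values = [row[metric][column] for row in table3.values()]
    return sum(values) / len(values)

summary = {}
reduction = {}
for metric in ("bias", "rmse", "mape"):
    before = mean_over_buoys(metric, 0)
    after = mean_over_buoys(metric, 1)
    summary[metric] = (round(before, 2), round(after, 2))
    if metric != "bias":
        # Relative reduction of the average error measure.
        reduction[metric] = 1.0 - after / before
    print(f"{metric}: {before:.2f} -> {after:.2f}")
```

The recomputed averages match the "Average" column of the table. On these numbers the mean RMSE drops by roughly 30% and the mean MAPE by roughly 54%.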

***** Desired Location of Figures 6-8 ******** 

In order to further support the above arguments, the time series of three different cases 
(buoys C, D, F) are presented in Figures 9-11. The improvement of the initial forecast by the 
elimination of the systematic error is obvious. 

***** Desired Location of Figures 9-11 ******** 

In Figure 12, the added value obtained from the combined use of KZ and Kalman filters is clarified. 
The instabilities produced by the use of Kalman filter only (circled) are eased by the prior 
utilization of a KZ(5,5) smoothing filter. 


5. Conclusions 

A new methodology is proposed for the adaptation of the results of numerical wave
prediction models to local wave conditions. It is based on the combination of two independent
statistical techniques: the Kolmogorov-Zurbenko and Kalman filters. The first transforms the time
series used - model direct outputs and corresponding observations - into a comparable mode.
Comparability is achieved by removing high variability, which is normally present only in the
observations since the forecasts are already smoothed spatially and temporally by the model
itself. In a second step, the KZ-smoothed series are processed by a Kalman filter, which can
identify and subtract possible existing systematic errors.

The proposed methodology was applied to an open sea area (southwest coast of the
United States) and has been evaluated by means of six buoys. In all cases, a considerable reduction
of the systematic error was achieved no matter its form (under- or overestimation). The
corresponding biases practically vanished, while the divergence measures (Root Mean Square Error
and Mean Absolute Percentage Error) were also noticeably decreased.

It is worth noticing that a substantial part of the success of the methodology presented is 
due to the presence of KZ filters. Without the latter the Kalman algorithm may produce serious 
instabilities due to the different qualitative characteristics of the initial time series. It is the 
author's belief that the proposed techniques may also give similar satisfactory results if applied to 
other atmospheric or wave parameters like temperature, wind speed, mean wave period, etc., 
contributing to the improvement of local meteorological predictions. 
Figures and Tables



Figure 1. The area of interest and the locations of buoys used (A-F) 


Buoy Label   Lat       Lon
A            N 34.88   W 120.87
B            N 33.65   W 120.2
C            N 33.75   W 119.08
D            N 33.22   W 119.88
E            N 32.43   W 119.53
F            N 32.5    W 118

Table 1. Buoy coordinates
[Figure 2, panels a and b: timeline with an Observation Period (Day d) and a Forecasting Period (Day d+1); series labelled model, observations, Kalman]

2a. Available observations and WAM forecasts are combined by the filters in order to reach a new improved forecast for the next day.

2b. The direct model outputs as well as the filtered forecasts are evaluated against next day observations.

Figure 2. Operational run.



[Two panels, one per buoy; horizontal axis: Time (hours); legend: obs, mod]

Figure 3. Direct model outputs and observations from buoys (A) and (E).
Figure 4a. Direct model output and observations. The time series are of different characteristics.

Figure 4b. Instabilities are produced by the direct application of a Kalman filter.



Buoy      Model   Obs
A         0.07    0.22
B         0.17    0.29
C         0.13    0.18
D         0.16    0.24
E         0.17    0.31
F         0.14    0.18
Average   0.14    0.24

Table 2. Variability Index (Var Index) for the forecasted and observed values.




[Two panels, one per buoy; horizontal axis: Time (hours); legend: obs, mod]

Figure 5. WAM forecasts and observations from buoys (A) and (E) after passing through a KZ(5,5) filter.
          Bias                        RMSE                        MAPE
Buoy      Model   Model + Filters     Model   Model + Filters     Model   Model + Filters
A         0.51    -0.08               0.74    0.63                0.42    0.26
B         0.13    -0.03               0.74    0.72                0.31    0.25
C         0.99    0.03                1.12    0.44                1.05    0.30
D         0.56    0.002               0.84    0.61                0.43    0.21
E         0.36    -0.02               0.82    0.69                0.38    0.23
F         0.70    0.01                0.89    0.49                0.63    0.23
Average   0.54    -0.01               0.86    0.60                0.54    0.25

Table 3. Statistics for all buoy locations before and after the use of the filters, referring to the whole study period.



Figure 6. Bias of WAM direct outputs and WAM+Filters. 



Figure 7. Root Mean Square Error of WAM direct outputs and WAM+Filters. 
Figure 8. Mean Absolute Percentage Error of WAM direct outputs and WAM+Filters.

Figure 9. WAM direct outputs, KZ+Kalman improved forecasts and the corresponding observations from buoy C.

[Plot legend: obs, WAM, WAM + Filters]


Figure 10. WAM direct outputs, KZ+Kalman improved forecasts and the corresponding observations from buoy D. 



Figure 11. WAM direct outputs, KZ+Kalman improved forecasts and the corresponding observations from buoy F. 
Figure 12. WAM direct outputs (red line), Kalman filtered forecast (green line) and KZ+Kalman filtered forecasts
(purple line) against the corresponding observations (blue line) from buoy A.